A Comparative Study of Stemming Algorithms
Author
Abstract
Stemming is a pre-processing step in text mining applications and a common requirement of natural language processing (NLP) functions. It is also important in most information retrieval (IR) systems. The main purpose of stemming is to reduce the different grammatical forms of a word (its noun, adjective, verb, and adverb forms, for example) to a common root form. In other words, stemming aims to reduce inflectional forms, and sometimes derivationally related forms, of a word to a common base form. This paper discusses different stemming methods and compares them in terms of usage, advantages, and limitations. The basic difference between stemming and lemmatization is also discussed.
Keywords: stemming; text mining; NLP; IR; suffix
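As a rough illustration of what reducing word forms to a common base form can look like, the following Python sketch strips a few English suffixes. The rule list, the function name `toy_stem`, and the `min_stem_len` parameter are assumptions made for this example and do not come from the paper; it is not one of the algorithms the paper compares.

```python
# Toy suffix-stripping stemmer: illustrative only, not a published algorithm.
# SUFFIXES is a hypothetical rule list, ordered so longer suffixes are tried first.
SUFFIXES = ("ions", "ion", "ings", "ing", "ed", "es", "s")


def toy_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when enough of the word remains to serve as a stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    for w in ("connect", "connected", "connecting", "connections", "running"):
        print(w, "->", toy_stem(w))
```

In this sketch `connected`, `connecting`, and `connections` all map to `connect`, while `running` becomes `runn`, which shows that a stem produced by plain suffix stripping need not be a dictionary word.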
Similar Resources
Stemming Algorithms: A Comparative Study and their Analysis
Stemming reduces a word to its stem or root form and is widely used in information retrieval tasks to increase recall and return the most relevant results. There are a number of ways to perform stemming, ranging from manual to automatic methods and from language-specific to language-independent ones, each with its own advantages over the others. This paper presents a comparati...
OPTIMAL SIZE AND GEOMETRY DESIGN OF TRUSS STRUCTURES UTILIZING SEVEN META-HEURISTIC ALGORITHMS: A COMPARATIVE STUDY
Meta-heuristic algorithms are applied in optimization problems in a variety of fields, including engineering, economics, and computer science. In this paper, seven population-based meta-heuristic algorithms are employed for size and geometry optimization of truss structures. These algorithms consist of the Artificial Bee Colony algorithm, Cyclical Parthenogenesis Algorithm, Cuckoo Search algori...
New stemming for Arabic text classification using feature selection and decision trees
In this paper we conduct a comparative study of two stemming algorithms, the Khoja stemmer and our new stemmer, for Arabic text classification (categorization), using chi-square statistics for feature selection and focusing on the decision tree classifier. The evaluation used a corpus of 5070 documents independently classified into six categories: sport, entertainment, business, middle east...
A COMPARATIVE STUDY FOR THE OPTIMAL DESIGN OF STEEL STRUCTURES USING CSS AND ACSS ALGORITHMS
In this article, an Advanced Charged System Search (ACSS) algorithm is applied for the optimum design of steel structures. ACSS uses the idea of Opposition-based Learning and Levy flight to enhance the optimization abilities of the standard CSS. It also utilizes the information of the position of each charged particle in the subsequent search process to increase the convergence speed. The objec...
A Comparative Study of Some Clustering Algorithms on Shape Data
Recently, some statistical studies have been carried out using shape data. One of these is the clustering of shape data, which is the main topic of this paper. We study several clustering algorithms on shape data and then identify the best algorithm based on accuracy, speed, and scalability criteria. In addition, we propose a method for representing the shape data that facilitates and ...